HTML-based scraping typically requires downloading the full web page, markup included, and often waiting for JavaScript to render dynamic content. To avoid triggering rate limits or being blocked, a scraper usually has to cap its concurrency (e.g., at 4 pages at a time). In contrast, FTP allows direct access to files and supports parallel downloads (e.g., 10 or more files concurrently), making it a faster and more scalable option when available.

---

## 🕳️ Nonstandard Protocol & Legacy Tech Evasion Tactics

### 🧾 **1. HTML4 (and "minimalist" HTML)**

**Why it can help:**

- Many modern bot detectors rely on **JavaScript execution, fingerprinting APIs, and dynamic content detection**.
- An HTML4 or JS-free crawler **avoids triggering those scripts altogether**.
- No `